Load Packages

For our analysis, in addition to the state-wise data package, we will use the following packages for data wrangling and visualization.

library(devtools)
library(tidyverse)
library(sf)
library(tidycensus)
library(tmap)
library(censusr)
library(here)
library(janitor)
library(lubridate)

Load Data

The data comes from a data set published by Open Data DC on its own website, which hosts crime data for past years. The data can be downloaded in .csv, .geojson, and various other formats. It is available at https://opendata.dc.gov/datasets/crime-incidents-in-2021/explore and is updated daily.
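
The loading step itself is not shown in the rendered output. As a minimal sketch of how a .geojson download could be read, the example below writes a small, made-up GeoJSON file to a temporary location and reads it back with st_read(), then lower-cases the column names with clean_names(). The toy columns and the temporary path are assumptions for illustration only, not the real file.

```r
library(sf)
library(janitor)

# Hypothetical two-row stand-in for the downloaded crime file.
toy <- st_sf(
  OFFENSE = c("ROBBERY", "ARSON"),
  REPORT_DAT = c("2021-01-01", "2021-02-01"),
  geometry = st_sfc(
    st_point(c(-77.03, 38.90)),
    st_point(c(-77.00, 38.89)),
    crs = 4326
  )
)

# Write the toy data to a temporary GeoJSON file (stands in for the download).
path <- tempfile(fileext = ".geojson")
st_write(toy, path, quiet = TRUE)

# Read the file and convert column names to snake_case.
crime_toy <- clean_names(st_read(path, quiet = TRUE))
names(crime_toy)
```

Cleaning the names this way is what makes lower-case column references such as offense and report_dat work in the later code.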

The crime data and the neighborhood boundary data were joined together with a spatial join, which attaches the attributes of the enclosing neighborhood to each crime point:

df = st_join(crime, neigh)
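
To make the behavior of st_join() concrete, here is a small sketch on made-up data: two square "neighborhoods" and three incident points in plain planar coordinates. All object names and values are hypothetical. By default the join is a left join using st_intersects, so every point is kept and points that fall outside all polygons get NA for the polygon attributes.

```r
library(sf)

# Helper (hypothetical) that builds a unit square with its lower-left corner at (xmin, ymin).
square <- function(xmin, ymin, size = 1) {
  st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)
  )))
}

# Two toy neighborhoods (no CRS set, so planar geometry is used).
toy_neigh <- st_sf(
  code = c("A", "B"),
  geometry = st_sfc(square(0, 0), square(1, 0))
)

# Three toy incidents: one in each square and one outside both.
toy_crime <- st_sf(
  offense = c("ROBBERY", "BURGLARY", "ARSON"),
  geometry = st_sfc(
    st_point(c(0.5, 0.5)),
    st_point(c(1.5, 0.5)),
    st_point(c(5, 5))
  )
)

# Each point picks up the code of the polygon that contains it.
toy_joined <- st_join(toy_crime, toy_neigh)
toy_joined$code
```

Rows with an NA code after the join are worth checking in real data, since they indicate incidents that did not fall inside any neighborhood polygon.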

tmap of Crimes by Neighborhood

The tmap code below uses interactive view mode. Each neighborhood polygon is shaded according to its area (the shapearea field). The colored dots represent the type of crime reported at each location.

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(neigh) + 
  tm_polygons("shapearea") +
tm_shape(crime) + 
  # shape is left unset: view mode only supports circles or icons
  tm_dots(col = "offense", palette = "Set1", stretch.palette = FALSE, size = 0.02) +
  tm_layout(legend.outside = TRUE)

Here are the first 10 rows of a table giving the number of crimes for each neighborhood code. Taking the maximum, we can see that the neighborhood with the code N31 has the most crime reports (1,846).

df2 <- df %>%
  count(code)
df2
Simple feature collection with 52 features and 2 fields
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: -77.11093 ymin: 38.8194 xmax: -76.91001 ymax: 38.99422
Geodetic CRS:  WGS 84
First 10 features:
   code    n                       geometry
1    N0   65 MULTIPOINT ((-77.04495 38.8...
2    N1  731 MULTIPOINT ((-77.02459 38.9...
3   N10  244 MULTIPOINT ((-77.0814 38.92...
4   N11   80 MULTIPOINT ((-77.07418 38.9...
5   N12 1583 MULTIPOINT ((-77.02091 38.8...
6   N13 1209 MULTIPOINT ((-77.03649 38.9...
7   N14  284 MULTIPOINT ((-76.97694 38.8...
8   N15   58 MULTIPOINT ((-77.00607 38.9...
9   N16  263 MULTIPOINT ((-76.96936 38.8...
10  N17  490 MULTIPOINT ((-76.95282 38.9...
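# Neighborhood with the most reports; slice(1) keeps a single row if several tie.
# The result is not named max, to avoid masking base::max().
top_neigh <- df2 %>% slice_max(n) %>% slice(1)
top_neigh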
Simple feature collection with 1 feature and 2 fields
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: -77.04274 ymin: 38.90356 xmax: -77.01618 ymax: 38.91612
Geodetic CRS:  WGS 84
  code    n                       geometry
1  N31 1846 MULTIPOINT ((-77.02192 38.9...
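
The extra slice(1) matters because slice_max() keeps all tied rows by default. A minimal sketch on made-up counts (the tibble below is hypothetical) shows the difference, along with the with_ties = FALSE argument as an alternative way to get exactly one row:

```r
library(dplyr)

# Hypothetical counts with a tie for first place.
toy_counts <- tibble(code = c("N1", "N2", "N3"), n = c(10, 25, 25))

# Default behavior: both tied rows are returned.
with_ties <- toy_counts %>% slice_max(n)

# with_ties = FALSE: exactly one row, the first of the tied rows.
one_row <- toy_counts %>% slice_max(n, n = 1, with_ties = FALSE)

nrow(with_ties)
one_row$code
```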

Additionally, here is a similar table that also breaks the counts down by the type of crime committed.

df3 <- df %>%
  count(code, offense)
df3
Simple feature collection with 390 features and 3 fields
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -77.11093 ymin: 38.8194 xmax: -76.91001 ymax: 38.99422
Geodetic CRS:  WGS 84
First 10 features:
   code                    offense  n                       geometry
1    N0 ASSAULT W/DANGEROUS WEAPON  2 MULTIPOINT ((-77.03281 38.8...
2    N0                   BURGLARY  1     POINT (-77.02091 38.89305)
3    N0                   HOMICIDE  1     POINT (-77.00704 38.89202)
4    N0        MOTOR VEHICLE THEFT  9 MULTIPOINT ((-77.02091 38.8...
5    N0                    ROBBERY  2     POINT (-77.00839 38.89721)
6    N0                  SEX ABUSE  2     POINT (-77.00839 38.89721)
7    N0               THEFT F/AUTO 28 MULTIPOINT ((-77.02207 38.8...
8    N0                THEFT/OTHER 20 MULTIPOINT ((-77.04495 38.8...
9    N1 ASSAULT W/DANGEROUS WEAPON 28 MULTIPOINT ((-77.0244 38.93...
10   N1                   BURGLARY 36 MULTIPOINT ((-77.0244 38.93...

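Here is a table containing the most frequent offense in each neighborhood and its number of occurrences. Because slice_max() keeps ties, the result has 55 rows for 52 neighborhood codes, so a few neighborhoods appear more than once.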

# Most frequent offense per neighborhood (ties are kept)
top_offense <- df3 %>%
  group_by(code) %>%
  slice_max(n)
top_offense
Simple feature collection with 55 features and 3 fields
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -77.10904 ymin: 38.8194 xmax: -76.91162 ymax: 38.99422
Geodetic CRS:  WGS 84

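From this table you can see that theft/other is generally the most common offense in the neighborhoods, though not in every one: in N0, for example, the counts above show THEFT F/AUTO (28) ahead of THEFT/OTHER (20).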
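
A claim like this can be checked by tallying how often each offense is the top one. The sketch below does so on a small made-up table (the codes and counts are hypothetical, not the real data). On the real df3, which is an sf object, one would probably call sf::st_drop_geometry() first so that only the attribute columns are summarized.

```r
library(dplyr)

# Hypothetical per-neighborhood offense counts.
toy_df3 <- tibble(
  code    = c("X", "X", "Y", "Y", "Z", "Z"),
  offense = c("THEFT F/AUTO", "THEFT/OTHER",
              "THEFT/OTHER", "BURGLARY",
              "THEFT/OTHER", "ROBBERY"),
  n       = c(28, 20, 40, 36, 15, 3)
)

# Keep exactly one top offense per neighborhood.
top_by_code <- toy_df3 %>%
  group_by(code) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%
  ungroup()

# Count in how many neighborhoods each offense ranks first.
top_tally <- top_by_code %>%
  count(offense, name = "neighborhoods", sort = TRUE)
top_tally
```

If a single offense were the most common everywhere, the tally would contain one row whose count equals the number of neighborhoods.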

Making Visualizations from the Data

The crime data from Open Data DC covers January 1, 2021 through December 8, 2021. We used several variables from the dataset to create plots that summarize crime in DC.

Time of the day

The bar chart below shows the number of reported crimes by time of day (the shift variable), ordered from most to least frequent.

ggplot(crime, aes(fct_infreq(shift))) +
  geom_bar() +
  labs(
    x="Time of Day",
    y="Number of Crimes Reported")+
  theme(legend.position = "none")

Method

The bar chart below shows the number of reported crimes by method, ordered from most to least frequent.

ggplot(crime, aes(x=fct_infreq(method))) +
  geom_bar() +
  labs(
    x="Method of Crime",
    y="Number of Crimes Reported",
    fill="Method") +
  scale_y_continuous(labels = scales::comma)+
  theme(legend.position = "none")

Offense

The ggplot code below shows the number of crimes by offense. In the plot, arson appears to have no records. In fact, the arson bar is simply too small to see: there are very few arson reports (3) compared to theft/other (10,236).

ggplot(crime, aes(x=fct_infreq(offense), fill=offense)) +
  geom_bar(stat='count') +
  labs(
    x="Criminal Offense",
    y="Number of Crimes Reported",
    fill="Offense") +
    theme(axis.text.x = element_text(angle = 30, hjust=1)) +
    theme(legend.position = "none")
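
One possible way to keep very small categories such as arson readable is to print each bar's count above it. The following is only a sketch on made-up data (the toy counts are hypothetical), using geom_text() with the count statistic:

```r
library(ggplot2)
library(forcats)

# Hypothetical data with one very rare category.
toy_crime <- data.frame(
  offense = rep(c("THEFT/OTHER", "ROBBERY", "ARSON"), times = c(1000, 200, 3))
)

p <- ggplot(toy_crime, aes(x = fct_infreq(offense))) +
  geom_bar() +
  # Label every bar with its count so tiny bars are still identifiable.
  geom_text(aes(label = after_stat(count)), stat = "count", vjust = -0.5) +
  labs(
    x = "Criminal Offense",
    y = "Number of Crimes Reported")
p
```

A logarithmic y axis would be another option, at the cost of making the bar heights harder to compare directly.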

Total Crimes (by month)

  • Data for December is incomplete, because the data set ends on December 8, 2021.
ggplot(crime, aes(factor(month(report_dat, label=TRUE)))) +
  geom_bar() +
  labs(
    x="Month",
    y="Number of Crimes Reported") +
  theme(legend.position = "none")
